NumPy

NumPy is the core library for scientific computing in Python. It provides a high-performance multidimensional array object, and tools for working with these arrays. Note that the terms array, NumPy array, and ndarray all refer to the same thing: the ndarray object.


In [ ]:
import math
import numpy as np

''' Display Precision Settings '''

np.set_printoptions(formatter={'float': '{: 0.4f}'.format})

Creating ndarrays


In [ ]:
data1 = [6, 7.5, 8, 0, 1]
arr1 = np.array(data1)
arr1

In [ ]:
data2 = [[1, 2, 3, 4], [5, 6, 7, 8]]
arr2 = np.array(data2)
arr2

In [ ]:
arr2.ndim

In [ ]:
arr2.shape

Unless a data type is explicitly specified, np.array tries to infer a good data type for the array that it creates. The data type is stored in a special dtype object:


In [ ]:
arr1.dtype

In [ ]:
arr2.dtype
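As a small sketch of the inference rule described above (the variable names are illustrative and not part of the original notebook): an all-integer list gets an integer dtype, a single float promotes the whole array to float64, and an explicit dtype argument overrides the inference.

```python
import numpy as np

# All-integer input: NumPy infers a (platform-dependent) integer dtype.
ints = np.array([1, 2, 3])

# One float in the input is enough to promote the whole array to float64.
mixed = np.array([1, 2.5, 3])

# An explicit dtype overrides the inference.
forced = np.array([1, 2, 3], dtype=np.float32)

print(ints.dtype.kind, mixed.dtype, forced.dtype)
```

The integer width is deliberately not printed as a fixed value here, since the default integer size can differ between platforms.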

In addition to np.array, there are a number of other functions for creating new arrays:

Method               Description
array                Convert input data (list, tuple, array, or other sequence type) to an ndarray either by inferring a dtype or explicitly specifying a dtype. Copies the input data by default.
asarray              Convert input to ndarray, but do not copy if the input is already an ndarray.
arange               Like the built-in range but returns an ndarray instead of a list.
ones, ones_like      Produce an array of all 1’s with the given shape and dtype. ones_like takes another array and produces a ones array of the same shape and dtype.
zeros, zeros_like    Like ones and ones_like but producing arrays of 0’s instead.
empty, empty_like    Create new arrays by allocating new memory, but do not populate with any values like ones and zeros.
eye, identity        Create a square N x N identity matrix (1’s on the diagonal and 0’s elsewhere).
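
The copy behaviour listed for array and asarray, and the way ones_like borrows shape and dtype, can be checked with a short sketch. The names source, copied, aliased, template and filled are made up for this illustration.

```python
import numpy as np

source = np.array([1, 2, 3])

copied = np.array(source)     # copies the input by default
aliased = np.asarray(source)  # input is already an ndarray: no copy is made

source[0] = 99

print(copied[0])          # 1: copied owns separate data
print(aliased[0])         # 99: aliased refers to the same data as source
print(aliased is source)  # True

# ones_like takes both shape and dtype from its argument.
template = np.zeros((2, 3), dtype=np.float32)
filled = np.ones_like(template)
print(filled.shape, filled.dtype)
```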

To create a higher dimensional array with these methods, pass a tuple for the shape.


In [ ]:
np.zeros(10)

In [ ]:
np.zeros((3,6))

In [ ]:
np.ones((2,2))

In [ ]:
np.eye(3,3)

The identity alternative to eye takes only one value:


In [ ]:
np.identity(2)

empty creates an array without initializing its values to any particular value. It is not safe to assume that np.empty will return an array of all zeros:


In [ ]:
np.empty((2, 3, 2))

In many cases it will return uninitialized garbage values.

arange is an array-valued version of the built-in Python range function:


In [ ]:
np.arange(15)

Data Types for ndarrays

The data type or dtype is a special object containing the information the ndarray needs to interpret a chunk of memory as a particular type of data:


In [ ]:
arr1 = np.array([1, 2, 3], dtype=np.float64)
arr1.dtype

In [ ]:
arr2 = np.array([1, 2, 3], dtype=np.int32)
arr2.dtype

Dtypes are part of what make NumPy so powerful and flexible.

It is often only necessary to care about the general kind of data you are dealing with, whether floating point, complex, integer, boolean, string, or general Python object.

When you need more control over how data are stored in memory and on disk, especially large data sets, it is good to know that you have control over the storage type.

Table: NumPy data types

Type                                 Type Code       Description
int8, uint8                          i1, u1          Signed and unsigned 8-bit (1 byte) integer types
int16, uint16                        i2, u2          Signed and unsigned 16-bit integer types
int32, uint32                        i4, u4          Signed and unsigned 32-bit integer types
int64, uint64                        i8, u8          Signed and unsigned 64-bit integer types
float16                              f2              Half-precision floating point
float32                              f4 or f         Standard single-precision floating point. Compatible with C float
float64                              f8 or d         Standard double-precision floating point. Compatible with C double and Python float object
float128                             f16 or g        Extended-precision floating point
complex64, complex128, complex256    c8, c16, c32    Complex numbers represented by two 32-, 64-, or 128-bit floats, respectively
bool                                 ?               Boolean type storing True and False values
object                               O               Python object type
string_ (bytes_ in NumPy 2.x)        S               Fixed-length string type (1 byte per character). For example, to create a string dtype with length 10, use 'S10'.
unicode_ (str_ in NumPy 2.x)         U               Fixed-length unicode type (number of bytes platform specific). Same specification semantics as string_ (e.g. 'U10').

In most cases they map directly onto an underlying machine representation, which makes it easy to read and write binary streams of data to disk and also to connect to code written in a low-level language like C or Fortran.

The numerical dtypes are named the same way: a type name, like float or int, followed by a number indicating the number of bits per element. A standard double-precision floating point value (what’s used under the hood in Python’s float object) takes up 8 bytes or 64 bits. Thus, this type is known in NumPy as float64.
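
The naming rule above can be verified directly. This is a minimal sketch that only uses np.dtype and its itemsize attribute; the loop variable names are illustrative.

```python
import numpy as np

# The trailing number in a dtype name is the element size in bits.
for name in ("int8", "int16", "int32", "int64", "float32", "float64"):
    dt = np.dtype(name)
    print(name, dt.itemsize, "bytes =", dt.itemsize * 8, "bits")

# A type code string and the named type describe the same dtype.
print(np.dtype("f8") == np.float64)  # True
print(np.dtype("i4") == np.int32)    # True
```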

You can explicitly convert or cast an array from one dtype to another using ndarray’s astype method:


In [ ]:
arr = np.array([1, 2, 3, 4, 5])
arr.dtype

In [ ]:
float_arr = arr.astype(np.float64)
float_arr.dtype

In [ ]:
float_arr

In this example, integers were cast to floating point. If I cast some floating point numbers to be of integer dtype, the decimal part will be truncated:


In [ ]:
arr = np.array([3.7, -1.2, -2.6, 0.5, 12.9, 10.1])
arr

In [ ]:
arr.astype(np.int32)
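
To make the truncation explicit, the following sketch compares astype with rounding down and rounding to nearest before the cast. The array values and the names truncated, floored and rounded are chosen for this example only.

```python
import numpy as np

values = np.array([3.7, -1.2, -2.6, 0.5])

truncated = values.astype(np.int32)           # drops the fractional part (toward zero)
floored = np.floor(values).astype(np.int32)   # rounds down first
rounded = np.round(values).astype(np.int32)   # rounds to nearest first (ties go to even)

print(truncated.tolist())  # [3, -1, -2, 0]
print(floored.tolist())    # [3, -2, -3, 0]
print(rounded.tolist())    # [4, -1, -3, 0]
```

Note that truncation and flooring only differ for negative inputs.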

Should you have an array of strings representing numbers, you can use astype to convert them to numeric form:


In [ ]:
# np.bytes_ is the current name of the fixed-length byte-string dtype (np.string_ in older NumPy)
numeric_strings = np.array(['1.25', '-9.6', '42'], dtype=np.bytes_)
numeric_strings.astype(float)

Note that NumPy mapped the Python float type passed to astype (rather than, say, np.float64) to the equivalent dtype.

If casting fails for some reason (for example, a string that cannot be converted to float64), a ValueError is raised.
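
A failed cast can be observed with a short sketch. The exact exception class may depend on the NumPy version, so this example catches both TypeError and ValueError and simply reports which one occurred; bad_strings and failed are names made up here.

```python
import numpy as np

bad_strings = np.array(["1.25", "not-a-number"])

try:
    bad_strings.astype(np.float64)
except (TypeError, ValueError) as exc:
    # The cast is all-or-nothing: one bad element aborts the whole conversion.
    failed = True
    print(type(exc).__name__, exc)
else:
    failed = False
```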

Another array's dtype can also be used:


In [ ]:
int_array = np.arange(10)
int_array

In [ ]:
calibers = np.array([.22, .270, .357, .380, .44, .50], dtype=np.float64)
int_array.astype(calibers.dtype)

There are shorthand type code strings for referring to a dtype (see table above):


In [ ]:
empty_uint32 = np.empty(8, dtype='u4')
empty_uint32

Note:

– Calling astype always creates a new array (a copy of the data), even if the new dtype is the same as the old dtype.

– Keep in mind that floating point numbers, such as those in float64 and float32 arrays, are only capable of approximating fractional quantities. In complex computations, floating point errors may accrue, making comparisons only valid up to a certain number of decimal places.
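
Both notes can be illustrated briefly. The sketch below assumes that a tolerance-based comparison with np.isclose is acceptable for the use case; the arrays a and b are invented for the example.

```python
import numpy as np

a = np.array([0.1 + 0.2, 0.5])
b = np.array([0.3, 0.5])

print((a == b).tolist())          # [False, True]: 0.1 + 0.2 is not exactly 0.3
print(np.isclose(a, b).tolist())  # [True, True]: equal within a tolerance
print(np.allclose(a, b))          # True

# astype returns a new array even when the dtype does not change.
same_dtype = a.astype(a.dtype)
print(same_dtype is a, np.shares_memory(same_dtype, a))  # False False
```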

Operations between Arrays and Scalars

Arrays are important because they enable you to express batch operations on data without writing any for loops. This is usually called vectorization. Any arithmetic operation between equal-size arrays is applied elementwise:


In [ ]:
arr = np.array([[1., 2., 3.], [4., 5., 6.]])
arr

In [ ]:
arr * arr

In [ ]:
arr - arr

Arithmetic operations with scalars are as you would expect, propagating the value to each element:


In [ ]:
1/arr

In [ ]:
arr ** (.5)
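
To show what vectorization replaces, here is a sketch that computes the same result once with explicit loops and once as a single array expression. The formula arr * arr + 1 is arbitrary and chosen only for illustration.

```python
import numpy as np

arr = np.array([[1., 2., 3.], [4., 5., 6.]])

# Explicit Python loops over every element.
looped = np.empty_like(arr)
for i in range(arr.shape[0]):
    for j in range(arr.shape[1]):
        looped[i, j] = arr[i, j] * arr[i, j] + 1

# The same computation as one vectorized expression.
vectorized = arr * arr + 1

print(np.array_equal(looped, vectorized))  # True
```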

Operations between differently sized arrays are called broadcasting and are not covered here.

Basic Indexing and Slicing

Array indexing is a rich topic, as there are many ways to select a subset of array data or individual elements. One-dimensional arrays are simple; on the surface they act similarly to Python lists:


In [ ]:
arr = np.arange(10)
arr

In [ ]:
arr[5]

In [ ]:
arr[5:8]

An important first distinction from lists is that array slices are views on the original array; that is, the data is not copied, and any modifications to the view will be reflected in the source array. For example, if a scalar value is assigned to a slice, the value is propagated (broadcast) to the entire selection:


In [ ]:
arr[5:8] = 12
arr

In [ ]:
arr_slice = arr[5:8]
arr_slice[1] = 12345
arr

In [ ]:
arr_slice[:] = 64
arr

This may seem surprising, since many other programming languages copy data more eagerly. But NumPy has been designed with large data use cases in mind, and returning views avoids the performance and memory problems that would arise if NumPy insisted on copying data left and right.

If you want a copy of a slice of an ndarray instead of a view, you will need to explicitly copy the array:


In [ ]:
arr[5:8].copy()
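
The difference between a view and a copy can be checked with np.shares_memory. This is a small sketch with illustrative names (base, view, independent); the list comparison at the end shows the contrasting behaviour of Python lists.

```python
import numpy as np

base = np.arange(10)

view = base[5:8]                # a slice: shares memory with base
independent = base[5:8].copy()  # an explicit copy: separate data

view[0] = -1
independent[1] = -2

print(base.tolist())                        # element 5 changed, element 6 did not
print(np.shares_memory(base, view))         # True
print(np.shares_memory(base, independent))  # False

# For comparison, slicing a Python list makes a copy.
items = list(range(10))
piece = items[5:8]
piece[0] = -1
print(items[5])  # 5
```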

With higher dimensional arrays, you have many more options. In a two-dimensional array, the elements at each index are no longer scalars but rather one-dimensional arrays:


In [ ]:
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
arr2d[2]

Thus, individual elements can be accessed recursively. But that is a bit too much work, so you can pass a comma-separated list of indices to select individual elements. So these are equivalent:


In [ ]:
arr2d[0][2]

In [ ]:
arr2d[0, 2]

In multidimensional arrays, if you omit later indices, the returned object will be a lower-dimensional ndarray consisting of all the data along the higher dimensions:


In [ ]:
arr3d = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
arr3d

Both scalar values and arrays can be assigned to arr3d[0]:


In [ ]:
old_values = arr3d[0].copy()
arr3d[0].copy()

In [ ]:
arr3d[0] = 42
arr3d

In [ ]:
arr3d[0] = old_values
arr3d

Similarly, arr3d[1, 0] gives you all of the values whose indices start with (1, 0), forming a 1-dimensional array:


In [ ]:
arr3d[1, 0]
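
The effect of supplying fewer indices than dimensions is easiest to see from the shapes. The sketch below rebuilds the same values as arr3d under a different name (cube) so that it can be run on its own.

```python
import numpy as np

cube = np.arange(1, 13).reshape((2, 2, 3))  # same values as arr3d

print(cube.shape)        # (2, 2, 3)
print(cube[0].shape)     # (2, 3): one index removes the first axis
print(cube[1, 0].shape)  # (3,): two indices remove two axes
print(cube[1, 0, 2])     # 9: all three indices give a single element

# Chained indexing reaches the same data as the comma form.
print(np.array_equal(cube[1][0], cube[1, 0]))  # True
```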

Indexing with slices

Like one-dimensional objects such as Python lists, ndarrays can be sliced using the familiar syntax:


In [ ]:
arr[1:6]

Higher dimensional objects give you more options, as you can slice one or more axes and also mix integers with slices. Consider the 2D array above, arr2d. Slicing this array is a bit different:


In [ ]:
arr2d

In [ ]:
arr2d[:2]

In [ ]:
arr2d[:,2]

In [ ]:
arr2d[:2, 1:]

In [ ]:
arr2d[1, :2]

In [ ]:
arr2d[2, :1]

In [ ]:
arr2d[:, :1]

And again, assigning to a slice expression assigns to the whole selection:


In [ ]:
arr2d[:2, 1:]

In [ ]:
arr2d[:2, 1:] = 0
arr2d[:2, 1:]

In [ ]:
arr2d
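
One detail worth seeing explicitly is that an integer index drops an axis while a slice keeps it. This sketch uses a fresh array named grid (an illustrative name) with the same starting values as arr2d.

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

column_as_1d = grid[:, 0]   # integer index: the axis is dropped
column_as_2d = grid[:, :1]  # slice: the axis is kept with length 1

print(column_as_1d.shape)  # (3,)
print(column_as_2d.shape)  # (3, 1)

# Assigning through a slice expression writes into the original array.
grid[:2, 1:] = 0
print(grid.tolist())  # [[1, 0, 0], [4, 0, 0], [7, 8, 9]]
```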

Boolean Indexing


In [ ]:
names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
names

In [ ]:
data = np.random.randn(7, 4)
data

(The randn function in numpy.random generates normally distributed random data.)

Suppose we wanted to select all the rows that correspond to the name 'Bob'.

Like arithmetic operations, comparisons (such as ==) with arrays are also vectorized. Thus, comparing names with the string 'Bob' yields a boolean array, which can be passed when indexing the array:


In [ ]:
names == 'Bob'

In [ ]:
data[names == 'Bob']

The boolean array must be of the same length as the axis it is indexing.

You can even mix and match boolean arrays with slices or integers (or sequences of integers, more on this later):


In [ ]:
data[names == 'Bob', 2:]

In [ ]:
data[names == 'Bob', 3]

In [ ]:
data[names != 'Bob']

To select two of the three names, combine multiple boolean conditions using boolean arithmetic operators like & (and) and | (or):


In [ ]:
mask = (names == 'Bob') | (names == 'Will')
mask

In [ ]:
data[mask]

Selecting data from an array by boolean indexing always creates a copy of the data, even if the returned array is unchanged.

Note: The Python keywords and and or do not work with boolean arrays.
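
The following sketch shows the elementwise operators next to the failing keyword form. The array labels and the flag keyword_failed are invented for this example.

```python
import numpy as np

labels = np.array(['Bob', 'Joe', 'Will', 'Bob'])

either = (labels == 'Bob') | (labels == 'Will')  # elementwise or
neither = ~either                                # elementwise not
both = (labels == 'Bob') & (labels != 'Joe')     # elementwise and

print(either.tolist())   # [True, False, True, True]
print(neither.tolist())  # [False, True, False, False]
print(both.tolist())     # [True, False, False, True]

# The keyword form asks for one truth value for a whole array, which is ambiguous.
try:
    (labels == 'Bob') or (labels == 'Will')
    keyword_failed = False
except ValueError as exc:
    keyword_failed = True
    print(exc)
```

The parentheses around each comparison are needed because & and | bind more tightly than ==.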

Setting values with boolean arrays works in a common-sense way. To set all of the negative values in data to 0 we need only do:


In [ ]:
data[data < 0] = 0
data

Setting whole rows or columns using a 1D boolean array:


In [ ]:
data[names != 'Joe'] = 7
data

Fancy Indexing

Fancy indexing is a term adopted by NumPy to describe indexing using integer arrays.


In [ ]:
arr = np.empty((8, 4))

for i in range(8):
    arr[i] = i

arr

To select out a subset of the rows in a particular order, simply pass a list:


In [ ]:
arr[[4, 3, 0, 6]]

In [ ]:
arr[[-3, -5, -7]]

Passing multiple index arrays does something slightly different; it selects a 1D array of elements corresponding to each tuple of indices:


In [ ]:
arr = np.arange(32).reshape((8, 4))
arr

In [ ]:
arr[[1, 5, 7, 2], [0, 3, 1, 2]]

That is, it works like a coordinate system for picking elements: the elements $(1, 0)$, $(5, 3)$, $(7, 1)$, and $(2, 2)$ were selected.

Instead, one way to obtain the rectangular region formed by selecting a subset of the matrix’s rows and columns is:


In [ ]:
arr[[1, 5, 7, 2]][:, [0, 3, 1, 2]]

Another way is to use the np.ix_ function, which converts two 1D integer arrays to an indexer that selects the rectangular region:


In [ ]:
arr[np.ix_([1, 5, 7, 2], [0, 3, 1, 2])]

Fancy indexing, unlike slicing, always copies the data into a new array.
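
The three selection styles and the copy behaviour can be compared side by side. This is a sketch with illustrative names (table, pairs, block_a, block_b, picked) that mirrors the arrays used above.

```python
import numpy as np

table = np.arange(32).reshape((8, 4))
rows, cols = [1, 5, 7, 2], [0, 3, 1, 2]

pairs = table[rows, cols]            # one element per (row, col) pair
block_a = table[rows][:, cols]       # rectangular region, in two steps
block_b = table[np.ix_(rows, cols)]  # rectangular region, in one step

print(pairs.tolist())                                   # [4, 23, 29, 10]
print(block_a.shape, np.array_equal(block_a, block_b))  # (4, 4) True

# Fancy indexing returns a copy, so writing to the result leaves table alone.
picked = table[[0, 1]]
picked[:] = -1
print(table[0].tolist())  # [0, 1, 2, 3]
```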

Transposing Arrays and Swapping Axes

Arrays have the transpose method and also the special T attribute:


In [ ]:
arr = np.arange(15).reshape((3, 5))
arr, arr.T

To compute the inner matrix product $X^T X$:


In [ ]:
arr = np.random.randn(6, 3)
np.dot(arr.T, arr)

For higher dimensional arrays, transpose will accept a tuple of axis numbers to permute the axes:


In [ ]:
arr = np.arange(16).reshape((2, 2, 4))
arr

In [ ]:
arr.transpose((1, 0, 2))

Simple transposing with .T is just a special case of swapping axes. ndarray has the method swapaxes which takes a pair of axis numbers:


In [ ]:
arr

In [ ]:
arr.swapaxes(1, 2)

swapaxes similarly returns a view on the data without making a copy.
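
The axis permutations and the view behaviour can be confirmed with shapes and np.shares_memory. The sketch below uses illustrative names (stack, permuted, swapped, X, gram) and a deterministic matrix instead of random data so that the result is reproducible.

```python
import numpy as np

stack = np.arange(16).reshape((2, 2, 4))

permuted = stack.transpose((1, 0, 2))  # reorder axes 0 and 1
swapped = stack.swapaxes(1, 2)         # exchange axes 1 and 2

print(permuted.shape)  # (2, 2, 4)
print(swapped.shape)   # (2, 4, 2)

# Element (i, j, k) of the original sits at (i, k, j) after swapaxes(1, 2).
print(stack[1, 0, 3] == swapped[1, 3, 0])  # True

# Both results are views: they share memory with the original.
print(np.shares_memory(stack, permuted), np.shares_memory(stack, swapped))

# X^T X for a 6 x 3 matrix is 3 x 3 and symmetric.
X = np.arange(18, dtype=float).reshape((6, 3))
gram = np.dot(X.T, X)
print(gram.shape, np.allclose(gram, gram.T))
```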

Universal Functions: Fast Element-wise Array Functions

A universal function, or ufunc, is a function that performs elementwise operations on data in ndarrays. You can think of ufuncs as fast vectorized wrappers for simple functions that take one or more scalar values and produce one or more scalar results.
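
A few representative ufuncs are sketched below: a unary one (np.sqrt), two binary ones (np.maximum, np.add) and one that returns two arrays (np.modf). The input values are chosen for this example so that every result is exact.

```python
import numpy as np

values = np.array([0.0, 1.0, 4.0, 9.0])

# Unary ufunc: one input array, one output array of the same shape.
roots = np.sqrt(values)
print(roots.tolist())  # [0.0, 1.0, 2.0, 3.0]

# Binary ufuncs: two inputs combined elementwise.
other = np.array([5.0, 0.5, 6.0, 2.0])
largest = np.maximum(values, other)
total = np.add(values, other)
print(largest.tolist())  # [5.0, 1.0, 6.0, 9.0]
print(total.tolist())    # [5.0, 1.5, 10.0, 11.0]

# A ufunc can return more than one array, e.g. modf (fractional and integer parts).
frac, whole = np.modf(np.array([1.5, -2.25]))
print(frac.tolist(), whole.tolist())  # [0.5, -0.25] [1.0, -2.0]
```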


In [ ]:
m = np.eye(3,3)
print(m)

In [ ]:
m[m>0]=4
print(m)

In [ ]:
old = np.array([[1, 1, 1],
                [1, 1, 1]])

new = old
new[0, :2] = 0

print(old)

In [ ]:
a = np.array([[2,3,4], [1,2,3]])
b = a * 1.0
c = a[1][1] * 1.0
c

In [ ]:
cvalue = [25.4, 24.8, 26.9, 23.9]
C = np.array(cvalue)
C

In [ ]:
(C*9/5 + 32)[0]

In [ ]:
round((C*9/5 + 32)[0],-1)

In [ ]:
np.array([1,4,9,15])//5.0

In [ ]:
np.arange(0, 10, 0.5, dtype = None)

In [ ]:
# num must be an integer sample count; int(math.pi) == 3
L, S = np.linspace(0, 50, num=int(math.pi), retstep=True)
S

In [ ]:
L

In [ ]:
x = np.array([[42,1,2],[1,2,3]])
x

In [ ]:
x[1][1]

In [ ]:
x%3

In [ ]:
x.ndim

In [ ]:
y = np.array([0., 1, 2, 3, 4., 5., 8, 13])

In [ ]:
print(y / 1.0)
print(y[1:4]*math.pi)
print(x[0, 1:])

In [ ]:
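# Rows have unequal lengths (11, 11, 10), so dtype=object is required for this ragged input.
x = np.array([range(0, 51, 5), list(range(0, 255, 25)),
              [True, True, True, True, True, False, False, False, False, False]], dtype=object)
s = x[0][:6]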

In [ ]:
x[0][:6]

In [ ]:
s == x[0][:6]

In [ ]:
t = np.zeros((9))
t

In [ ]:
t = t.reshape(3,3)
t

In [ ]:
x = np.ones((3,3))
y = np.ones_like(x)
z = np.identity(5, dtype=float)

In [ ]:
x

In [ ]:
y

In [ ]:
z

In [ ]:
q = (np.eye(3,k=-1, dtype=int) + np.eye(3,k=+1, dtype=int)) * 4
q

In [ ]:
define = (3,3)
define

In [ ]:
rho = np.random.random(define)
rho

In [ ]:
print('The required random value is', "%.3f" % rho[2][0], 'to 3 d.p.')

In [ ]:
np.empty(define)

In [ ]:
print(z)
print(np.average(z))
print(np.median(z) < np.average(z))
print(np.std(z[0]))
print(np.max(z) * math.e)

In [ ]:
g = np.dot(q, z[:3,:3]) / 2.5
print(g)

In [ ]:
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0.1, 2*np.pi, 10)
markerline, stemlines, baseline = plt.stem(x, np.cos(x), '-.')
plt.setp(baseline, 'color', 'r', 'linewidth', 2)

plt.show()
